Conversation
Fix parseEndpointList to handle single and double quotes that are treated as literal characters on Windows.

- Strip surrounding quotes before parsing comma-separated endpoints
- Fixes test failures on Windows CI

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
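The quote-stripping approach can be sketched as follows. The real `parseEndpointList` internals are not shown in this PR excerpt, so the helper below is an illustrative reconstruction, not the actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// stripSurroundingQuotes removes one matching pair of single or double
// quotes around s. On Windows, quotes are passed through as literal
// characters instead of being consumed by the shell.
func stripSurroundingQuotes(s string) string {
	if len(s) >= 2 {
		first, last := s[0], s[len(s)-1]
		if first == last && (first == '"' || first == '\'') {
			return s[1 : len(s)-1]
		}
	}
	return s
}

// parseEndpointList splits a comma-separated endpoint list after
// stripping any surrounding quotes (illustrative reconstruction).
func parseEndpointList(raw string) []string {
	raw = stripSurroundingQuotes(raw)
	var endpoints []string
	for _, part := range strings.Split(raw, ",") {
		if trimmed := strings.TrimSpace(part); trimmed != "" {
			endpoints = append(endpoints, trimmed)
		}
	}
	return endpoints
}

func main() {
	// On Windows the quotes survive into the argument; strip them first.
	fmt.Println(parseEndpointList(`"host1:50052, host2:50052"`))
}
```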
RPC health checking now runs continuously from process creation until proxy shutdown, completely independent of whether the model is loaded, starting, stopped, or in any other state.

- Start health checker in NewProcess when rpcHealthCheck is enabled
- Remove stopRPCHealthChecker - only stops on proxy shutdown
- Remove state checks from health checker goroutine
- Health status always reflects current RPC endpoint availability

Previously, the health checker only ran while a process was in StateReady, causing stale health data when processes stopped. Now /v1/models always shows accurate RPC health regardless of model state.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Work in progress on web configuration feature. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The requestTimeout feature was not working because the timeout context was not connected to the HTTP request. When the timeout fired, it attempted to kill the process but the reverse proxy continued waiting for a response indefinitely.

- Use context.WithTimeout() to create a timeout context for the HTTP request
- Clone the request with the timeout context before proxying
- When timeout fires, the HTTP request is immediately cancelled
- Fix StopImmediately() to handle timeouts during model loading (StateStarting)
- Add unit test to verify timeout behavior

Before: requests would run for 60+ seconds despite requestTimeout: 20
After: requests terminate in exactly 20 seconds as configured

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add brief mention of requestTimeout feature in the customizable features section of README. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Feat web config
Feat timeout
…thcheck Feat conditional rpc healthcheck
Adjust RPC health check parameters to reduce false positives when endpoints are under load and fix multiple security/correctness issues.

- increase RPC health check timeout from 500ms to 3s to handle busy servers
- decrease check interval from 30s to 10s for faster detection
- fix process_timeout_test missing context parameter
- fix config_embed to return cloned byte slice preventing mutation
- update ui dependencies: js-yaml 4.1.1, @codemirror/state 6.5.4, @codemirror/lang-yaml 6.1.2

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix editor cleanup and improve dark mode appearance with better colors, contrast, and styling.

- Add proper editor disposal in $effect cleanup
- Update theme colors for better dark mode visibility
- Improve button styling with teal export button
- Better text contrast and subtle borders
- Refine error message styling

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Prevent stopCommand() from hanging when a timeout transitions StateStarting to StateStopping before cmd.Start() completes.

- Close cmdWaitChan when cmd.Start() fails
- Add guard in stopCommand() to skip wait if process never started
- Add tests for hang scenarios
- Fix TestProcess_RequestTimeout to avoid calling t.Fatal from handler goroutine

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix ParseRPCEndpoints to handle Windows shlex behavior where single-quoted strings with spaces are split into multiple arguments.

- Collect all non-flag arguments after --rpc flag
- Join them with space before parsing endpoint list
- Fixes test failure: TestParseRPCEndpoints_ValidFormats/multiple_endpoints_with_spaces_trimmed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
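The collect-and-rejoin step can be sketched as below. `collectRPCArgs` is an illustrative name; the real `ParseRPCEndpoints` also parses the rejoined string into endpoints, which is omitted here:

```go
package main

import (
	"fmt"
	"strings"
)

// collectRPCArgs gathers every token after --rpc that is not another
// flag and rejoins them with spaces, so an endpoint list that Windows
// shlex split into several arguments is reassembled before parsing.
func collectRPCArgs(args []string) string {
	for i, arg := range args {
		if arg != "--rpc" {
			continue
		}
		var parts []string
		for _, tok := range args[i+1:] {
			if strings.HasPrefix(tok, "-") {
				break // the next flag ends the endpoint list
			}
			parts = append(parts, tok)
		}
		return strings.Join(parts, " ")
	}
	return ""
}

func main() {
	// Windows shlex split "'host1:50052, host2:50052'" into two tokens.
	args := []string{"llama-server", "--rpc", "host1:50052,", "host2:50052", "--port", "8080"}
	fmt.Println(collectRPCArgs(args))
}
```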
Fix data race between Process.start() and Process.stopCommand() by using a cmdStarted flag instead of checking cmd.Process directly. Also fix hang when StopImmediately() is called during startup.

Changes:
- Add cmdStarted bool flag to track if Start() completed successfully
- Initialize cancelUpstream and cmdWaitChan before Start() so stopCommand() can access them during startup
- Always call cancelUpstream() in stopCommand() to cancel context, even if Start() hasn't completed
- Only wait on cmdWaitChan if cmdStarted is true to avoid hanging
- Reset cmdStarted=false when process exits
- Change cancelUpstream==nil error to debug message (expected in tests using forceState())
- Use platform-appropriate sleep command in tests (timeout on Windows, sleep on Unix)

Fixes race condition and Windows test failures.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
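The coordination pattern described above can be reduced to a minimal sketch: the wait channel exists before `Start()` is attempted, a failed `Start()` closes it, and `stopCommand` only waits when `cmdStarted` is true. Field and method names mirror the commit message but the struct itself is illustrative:

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
)

type procState struct {
	mu          sync.Mutex
	cmd         *exec.Cmd
	cmdStarted  bool // true only after Start() succeeded
	cmdWaitChan chan struct{}
}

func (p *procState) start(name string, args ...string) error {
	p.mu.Lock()
	p.cmd = exec.Command(name, args...)
	// Created before Start() so a concurrent stopCommand can see it.
	p.cmdWaitChan = make(chan struct{})
	p.mu.Unlock()

	if err := p.cmd.Start(); err != nil {
		close(p.cmdWaitChan) // unblock any stopCommand already waiting
		return err
	}

	p.mu.Lock()
	p.cmdStarted = true
	p.mu.Unlock()

	go func() {
		p.cmd.Wait()
		close(p.cmdWaitChan)
	}()
	return nil
}

func (p *procState) stopCommand() {
	p.mu.Lock()
	started := p.cmdStarted
	waitCh := p.cmdWaitChan
	p.mu.Unlock()

	if !started {
		return // process never started; nothing to kill or wait for
	}
	p.cmd.Process.Kill()
	<-waitCh // closes via Wait(); no hang even if Start() had failed
}

func main() {
	p := &procState{}
	err := p.start("definitely-not-a-real-binary")
	fmt.Println("start failed:", err != nil)
	p.stopCommand() // returns immediately instead of hanging
	fmt.Println("stop returned")
}
```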
Fix theme compartment sharing bug and improve error response handling.

- create separate Compartment instances for each CodeMirror editor
- update createEditor to accept compartment parameter
- improve saveConfig error handling to parse both JSON and non-JSON responses
- include status code and statusText in error messages
Fix race condition in StopImmediately where state changes between CurrentState() and swapState() could cause stop to abort. Add retry loop to handle ErrExpectedStateMismatch. Fix test race condition where assert.Error was called inside goroutine. Move error assertion to main test goroutine using buffered channel. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
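The retry pattern this commit adds is a classic compare-and-swap loop. A minimal sketch, with illustrative state constants and a simplified `swapState` (the real Process state machine has more states and side effects):

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var ErrExpectedStateMismatch = errors.New("expected state mismatch")

const (
	StateReady int32 = iota
	StateStarting
	StateStopping
)

type state struct{ v atomic.Int32 }

func (s *state) CurrentState() int32 { return s.v.Load() }

// swapState only succeeds if the state is still what the caller observed.
func (s *state) swapState(expected, next int32) error {
	if !s.v.CompareAndSwap(expected, next) {
		return ErrExpectedStateMismatch
	}
	return nil
}

// stopImmediately retries the read-then-swap until it wins the race,
// instead of aborting when another goroutine changes the state between
// CurrentState() and swapState().
func stopImmediately(s *state) {
	for {
		cur := s.CurrentState()
		if cur == StateStopping {
			return // another goroutine is already stopping
		}
		if err := s.swapState(cur, StateStopping); err == nil {
			return
		}
		// state changed underneath us; re-read and retry
	}
}

func main() {
	s := &state{}
	s.v.Store(StateStarting)
	stopImmediately(s)
	fmt.Println("stopped:", s.CurrentState() == StateStopping)
}
```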
Fix getSleepCommand to use full path to timeout.exe on Windows. This avoids conflict with GNU coreutils timeout command in Git Bash environments on CI. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Separate request context from timeout monitoring context to avoid misattributing parent-imposed deadlines. Create monitoring context from context.Background() to reliably detect configured timeout, while maintaining request timeout for proper cancellation.

- requestCtx: with timeout for request cancellation
- timeoutCtx: from Background() for monitoring only
- prevents false positives from parent context deadlines

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Feat web config
Feat timeout
…thcheck Feat conditional rpc healthcheck
I/O timeout errors now don't mark RPC endpoints as unhealthy. Timeouts are logged at debug level and health state is preserved.

- detect timeout errors using net.Error.Timeout()
- add test for timeout handling behavior

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…thcheck proxy: ignore I/O timeout in RPC health checks
Fix cursor jumping to top after typing by preventing reactive effect from re-running on content changes. Use untrack() to read config state without creating reactive dependency, ensuring editor is only created once and not destroyed/recreated on each keystroke. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
ui-svelte: fix Config editor cursor jumping on input
merge upstream
Feat timeout
Resolved conflicts:
- Merged both Playground and Config features in App.svelte
- Combined dependencies from both branches in package.json
- Removed old ui/ files (replaced by ui-svelte/)
- Regenerated package-lock.json with merged dependencies
Merge latest upstream changes including:
- global TTL feature
- setParamsByID filter
- request/response capturing
- playground UI (chat, images, speech, transcription, rerank)
- incremental streaming markdown
- infill timings support
- in-flight request tracking
- smart auto-scroll in LogPanel
- copy button for code blocks
- OGG audio format support
- cuda13 architecture support

Conflict resolutions:
- proxy/process_test.go: kept context.Background() param + upstream var rename
- proxy/proxymanager_api.go: kept both config editor and captures APIs
- ui-svelte/package.json: kept both codemirror and markdown/katex deps
- ui-svelte/src/App.svelte: kept both Config and Playground routes
- ui-svelte/src/components/Header.svelte: fixed $location to $currentRoute
- ui/package*: accepted upstream deletion of old React UI
Merge upstream changes (globalTTL, captures, playground, streaming markdown, rerank, infill timings, in-flight tracking, setParamsByID) with fork features (config editor, timeout handling, RPC health checks, context parameter).

- resolve App.svelte conflict: keep Config + PlaygroundStub routes
Walkthrough: The pull request introduces request timeout enforcement for model inference, adds RPC health checking for distributed inference models with TCP endpoint monitoring, implements configuration API endpoints, replaces the React UI with a new Svelte-based UI including a configuration editor, updates the Makefile to build the Svelte UI, and refactors configuration hot-reloading in the main application.
Pull request overview
This PR merges upstream changes that add a web-based configuration editor to the Svelte UI, plus new backend support for reading/writing the running config file, and introduces two new model-level operational controls: RPC endpoint health gating and per-request timeouts.
Changes:
- Add a `/config` UI route with a CodeMirror YAML editor (current config editable, embedded example config read-only), including import/export and client-side YAML validation.
- Add authenticated backend API endpoints to fetch the current config file, fetch the embedded example config, and update the config (triggering a reload event).
- Introduce `rpcHealthCheck` (filter unhealthy RPC models from `/v1/models` and return 503 on inference) and `requestTimeout` (kill the model process if a request exceeds a configured duration), with accompanying schema/docs/tests.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| ui-svelte/src/routes/Config.svelte | New configuration editor page (CodeMirror + YAML validation + import/export/save). |
| ui-svelte/src/components/Header.svelte | Add navigation link to the new /config route. |
| ui-svelte/src/App.svelte | Register /config route in the SPA router. |
| ui-svelte/package.json | Add CodeMirror + js-yaml dependencies needed for the config editor. |
| ui-svelte/package-lock.json | Lockfile updates for the new UI dependencies. |
| README.md | Update feature list to mention request timeout protection and RPC health checking. |
| proxy/proxymanager.go | Pass shutdown context into process groups; filter /v1/models based on RPC health; add setters for config path/example. |
| proxy/proxymanager_api.go | Add /api/config/* endpoints to read/write config and trigger reload events. |
| proxy/processgroup.go | Thread shutdown context through to each Process instance. |
| proxy/processgroup_test.go | Update tests for new NewProcessGroup(..., shutdownCtx) signature. |
| proxy/process.go | Implement requestTimeout enforcement and background RPC endpoint health checking; improve stop/start race handling. |
| proxy/process_timeout_test.go | Add tests around request timeout behavior. |
| proxy/process_test.go | Update tests for new NewProcess(..., shutdownCtx) signature and add stop/start edge-case tests. |
| proxy/process_rpc_health_test.go | Add tests for RPC health checking behavior. |
| proxy/config/model_config.go | Add rpcHealthCheck and requestTimeout fields to model config with defaults. |
| proxy/config/config.go | Add ParseRPCEndpoints helper for extracting --rpc endpoints from model commands. |
| proxy/config/config_test.go | Add unit tests for ParseRPCEndpoints. |
| Makefile | Update UI build/install targets to use the Svelte UI directory. |
| llama-swap.go | Persist config path/example into each newly created ProxyManager; always listen for config-change events to reload. |
| docs/configuration.md | Document new config options/features (including requestTimeout and rpcHealthCheck). |
| config.example.yaml | Add examples/documentation for requestTimeout and distributed RPC health checking. |
| config-schema.json | Extend JSON schema with requestTimeout and rpcHealthCheck. |
| config_embed.go | Embed config.example.yaml into the binary for the UI “Example Config” view. |
Files not reviewed (1)
- ui-svelte/package-lock.json: Language not supported
```go
// Write to config file
if err := os.WriteFile(configPath, body, 0644); err != nil {
```

apiUpdateConfig writes the config with a fixed mode of 0644, which can unintentionally widen permissions (e.g., from 0600) and expose secrets such as API keys. Preserve the existing file mode (stat + write with same permissions) or default to a more restrictive mode (e.g., 0600) when creating the file.

Suggested change:

```diff
-// Write to config file
-if err := os.WriteFile(configPath, body, 0644); err != nil {
+// Determine file permissions: preserve existing mode if the file exists,
+// otherwise use a restrictive default (0600).
+var perm os.FileMode = 0600
+if info, statErr := os.Stat(configPath); statErr == nil {
+	perm = info.Mode().Perm()
+}
+// Write to config file
+if err := os.WriteFile(configPath, body, perm); err != nil {
```
```go
	pm.sendErrorResponse(c, http.StatusBadRequest, "Config file path not set")
	return
}
```

apiUpdateConfig is a state-changing endpoint reachable from browsers, but it doesn't include any CSRF mitigation (e.g., Origin/Referer validation or requiring a custom header like x-api-key). Because apiKeyAuth() accepts Basic auth (which browsers may automatically attach cross-site), a malicious site could potentially trigger a config overwrite if the user has authenticated. Consider rejecting requests with missing/mismatched Origin, or requiring x-api-key for mutating /api/config operations.

Suggested change:

```diff
+// Basic CSRF mitigation: require a custom header that browsers cannot send
+// with simple cross-site form submissions or image requests.
+if c.GetHeader("X-API-Key") == "" {
+	pm.sendErrorResponse(c, http.StatusForbidden, "Missing X-API-Key header")
+	return
+}
```
```go
ctx, cancel := context.WithCancel(p.shutdownCtx)
p.rpcHealthCancel = cancel
p.rpcHealthTicker = time.NewTicker(10 * time.Second)
```

RPC health check interval is hard-coded to 10s (time.NewTicker(10 * time.Second)), but the docs/config examples added in this PR state 30 seconds. Either update the documentation to match, or make the interval configurable so behavior and docs stay consistent.

Suggested change:

```diff
-p.rpcHealthTicker = time.NewTicker(10 * time.Second)
+p.rpcHealthTicker = time.NewTicker(30 * time.Second)
```
```go
allHealthy := true

for _, endpoint := range p.rpcEndpoints {
	dialer := net.Dialer{Timeout: 3 * time.Second}
	conn, err := dialer.Dial("tcp", endpoint)
	if err != nil {
		// Ignore I/O timeout errors - don't mark as unhealthy
		if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
			p.proxyLogger.Debugf("<%s> RPC endpoint %s timeout (ignoring): %v", p.ID, endpoint, err)
			continue
		}
		p.proxyLogger.Warnf("<%s> RPC endpoint %s unhealthy: %v", p.ID, endpoint, err)
		allHealthy = false
		break
	}
	conn.Close()
}

wasHealthy := p.rpcHealthy.Load()
```

checkRPCHealth() treats TCP dial timeouts as "ignore" and continues, which can result in allHealthy staying true and the model being considered healthy even when an endpoint is unreachable (a timeout is usually a strong signal of unhealthiness). If you want to avoid flapping, consider keeping the previous health state on timeout or marking the endpoint unhealthy after N consecutive timeouts, rather than counting timeouts as healthy.

Suggested change:

```diff
-allHealthy := true
-for _, endpoint := range p.rpcEndpoints {
-	dialer := net.Dialer{Timeout: 3 * time.Second}
-	conn, err := dialer.Dial("tcp", endpoint)
-	if err != nil {
-		// Ignore I/O timeout errors - don't mark as unhealthy
-		if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
-			p.proxyLogger.Debugf("<%s> RPC endpoint %s timeout (ignoring): %v", p.ID, endpoint, err)
-			continue
-		}
-		p.proxyLogger.Warnf("<%s> RPC endpoint %s unhealthy: %v", p.ID, endpoint, err)
-		allHealthy = false
-		break
-	}
-	conn.Close()
-}
-wasHealthy := p.rpcHealthy.Load()
+// Start with the assumption that all endpoints are healthy; adjust based on checks.
+allHealthy := true
+wasHealthy := p.rpcHealthy.Load()
+anyTimeout := false
+anyFailure := false
+for _, endpoint := range p.rpcEndpoints {
+	dialer := net.Dialer{Timeout: 3 * time.Second}
+	conn, err := dialer.Dial("tcp", endpoint)
+	if err != nil {
+		// For I/O timeout errors, record the timeout but don't immediately flip health.
+		if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
+			p.proxyLogger.Debugf("<%s> RPC endpoint %s timeout (ignoring for health flip): %v", p.ID, endpoint, err)
+			anyTimeout = true
+			continue
+		}
+		// Non-timeout errors are treated as unhealthy.
+		p.proxyLogger.Warnf("<%s> RPC endpoint %s unhealthy: %v", p.ID, endpoint, err)
+		allHealthy = false
+		anyFailure = true
+		break
+	}
+	conn.Close()
+}
+// If there were only timeouts (no definite successes or failures), keep the previous state
+// instead of forcing the health to healthy.
+if anyFailure {
+	allHealthy = false
+} else if anyTimeout && !anyFailure {
+	allHealthy = wasHealthy
+} else {
+	// No failures and no timeouts: all endpoints responded successfully.
+	allHealthy = true
+}
```
```yaml
# rpcHealthCheck: enable TCP health checks for RPC endpoints
# - optional, default: false
# - when enabled, parses --rpc host:port[,host:port,...] from cmd
# - performs TCP connectivity checks every 30 seconds
```

This comment says RPC connectivity checks run "every 30 seconds", but the implementation introduced in proxy/process.go runs the ticker every 10 seconds. Please align the example documentation with the actual behavior (or make the interval configurable).

Suggested change:

```diff
-# - performs TCP connectivity checks every 30 seconds
+# - performs TCP connectivity checks every 10 seconds
```
```json
"rpcHealthCheck": {
  "type": "boolean",
  "default": false,
  "description": "Enable TCP health checks for RPC endpoints specified in cmd. When enabled, parses --rpc host:port[,host:port,...] from cmd and performs health checks every 30 seconds. Models with unhealthy RPC endpoints are filtered from /v1/models and return 503 on inference requests."
```

Schema description says RPC health checks run "every 30 seconds", but the current implementation introduced in proxy/process.go uses a 10-second ticker. Please update the schema description (or the code) so they match.

Suggested change:

```diff
-"description": "Enable TCP health checks for RPC endpoints specified in cmd. When enabled, parses --rpc host:port[,host:port,...] from cmd and performs health checks every 30 seconds. Models with unhealthy RPC endpoints are filtered from /v1/models and return 503 on inference requests."
+"description": "Enable TCP health checks for RPC endpoints specified in cmd. When enabled, parses --rpc host:port[,host:port,...] from cmd and performs health checks every 10 seconds. Models with unhealthy RPC endpoints are filtered from /v1/models and return 503 on inference requests."
```
```go
// TestProcess_RequestTimeout verifies that requestTimeout actually kills the process
func TestProcess_RequestTimeout(t *testing.T) {
	// Create error channel to report handler errors from the mock server goroutine
	srvErrCh := make(chan error, 1)

	// Create a mock server that simulates a long-running inference
	mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		t.Logf("Mock server received request")

		// Simulate streaming response that takes 60 seconds
		w.Header().Set("Content-Type", "text/event-stream")
		w.WriteHeader(http.StatusOK)

		flusher, ok := w.(http.Flusher)
		if !ok {
			srvErrCh <- fmt.Errorf("Expected http.ResponseWriter to be an http.Flusher")
			return
		}

		// Stream data for 60 seconds
		for i := 0; i < 60; i++ {
			select {
			case <-r.Context().Done():
				t.Logf("Mock server: client disconnected after %d seconds", i)
				return
			default:
				fmt.Fprintf(w, "data: token %d\n\n", i)
				flusher.Flush()
				time.Sleep(1 * time.Second)
			}
		}
		t.Logf("Mock server completed full 60 second response")
	}))
	defer mockServer.Close()

	// Setup process logger - use NewLogMonitor() to avoid race in test
	processLogger := NewLogMonitor()
	proxyLogger := NewLogMonitor()

	// Create process with 5 second request timeout
	cfg := config.ModelConfig{
		Proxy:          mockServer.URL,
		CheckEndpoint:  "none", // skip health check
		RequestTimeout: 5,      // 5 second timeout
	}

	p := NewProcess("test-timeout", 30, cfg, processLogger, proxyLogger, context.Background())
	p.gracefulStopTimeout = 2 * time.Second // shorter for testing

	// Manually set state to ready (skip actual process start)
	p.forceState(StateReady)
```

This test description says it verifies requestTimeout "kills the process", but the test forces the process into StateReady without starting an upstream command. In that setup StopImmediately() can't actually terminate a real process (and stopCommand() short-circuits when cancelUpstream is nil), so the test is really only validating request cancellation/timeout behavior. Consider renaming/rewording the test (or adjusting it) so the assertion matches what is actually exercised.
```go
func TestProcess_RPCHealthCheckTimeoutIgnored(t *testing.T) {
	testLogger := NewLogMonitorWriter(io.Discard)
	proxyLogger := NewLogMonitorWriter(io.Discard)
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Use an IP address that will timeout (non-routable IP)
	// 192.0.2.0/24 is reserved for documentation/testing (RFC 5737)
	modelConfig := config.ModelConfig{
		Cmd:            "llama-server --rpc 192.0.2.1:50051",
		Proxy:          "http://localhost:8080",
		RPCHealthCheck: true,
	}

	process := NewProcess("test-model", 5, modelConfig, testLogger, proxyLogger, ctx)

	// Verify endpoints were parsed
	assert.NotEmpty(t, process.rpcEndpoints, "RPC endpoints should be parsed from cmd")
	assert.Equal(t, []string{"192.0.2.1:50051"}, process.rpcEndpoints)

	// Initially should be unhealthy (false) until first check
	assert.False(t, process.rpcHealthy.Load(), "RPC health should start as false")

	// Manually run health check - this should timeout but not mark as unhealthy
	process.checkRPCHealth()

	// After timeout, should remain at initial state (false) but not be marked unhealthy
	// The key is that timeout doesn't change the state - it's effectively a no-op
	// To test this properly, let's set it to healthy first, then see if timeout changes it
	process.rpcHealthy.Store(true)
	initialState := process.rpcHealthy.Load()
	assert.True(t, initialState, "Should be healthy before timeout check")

	// Run health check that will timeout
	process.checkRPCHealth()

	// After timeout, should still be healthy (timeout is ignored)
	assert.True(t, process.rpcHealthy.Load(), "Should remain healthy after timeout")
}
```

TestProcess_RPCHealthCheckTimeoutIgnored assumes dialing 192.0.2.1:50051 will reliably time out, but on some OS/network setups it may fail fast (e.g., "no route to host"), making the test flaky and environment-dependent. Consider refactoring RPC health checking to allow injecting a dial function (so you can deterministically return a timeout), or otherwise avoid asserting on timeout-specific behavior using real networking.
No description provided.